Goto

Collaborating Authors

 internal state


A Quantifiable Information-Processing Hierarchy Provides a Necessary Condition for Detecting Agency

Kagan, Brett J., Baccetti, Valentina, Earp, Brian D., Boyd, J. Lomax, Savulescu, Julian, Razi, Adeel

arXiv.org Machine Learning

As intelligent systems are developed across diverse substrates - from machine learning models and neuromorphic hardware to in vitro neural cultures - understanding what gives a system agency has become increasingly important. Existing definitions, however, tend to rely on top-down descriptions that are difficult to quantify. We propose a bottom-up framework grounded in a system's information-processing order: the extent to which its transformation of input evolves over time. We identify three orders of information processing. Class I systems are reactive and memoryless, mapping inputs directly to outputs. Class II systems incorporate internal states that provide memory but follow fixed transformation rules. Class III systems are adaptive; their transformation rules themselves change as a function of prior activity. While not sufficient on their own, these dynamics represent necessary informational conditions for genuine agency. This hierarchy offers a measurable, substrate-independent way to identify the informational precursors of agency. We illustrate the framework with neurophysiological and computational examples, including thermostats and receptor-like memristors, and discuss its implications for the ethical and functional evaluation of systems that may exhibit agency.


Globally Convergent Policy Search for Output Estimation

Neural Information Processing Systems

We introduce the first direct policy search algorithm which provably converges to the globally optimal dynamic filter for the classical problem of predicting the outputs of a linear dynamical system, given noisy, partial observations. Despite the ubiquity of partial observability in practice, theoretical guarantees for direct policy search algorithms, one of the backbones of modern reinforcement learning, have proven difficult to achieve. This is primarily due to the degeneracies which arise when optimizing over filters that maintain an internal state. In this paper, we provide a new perspective on this challenging problem based on the notion of informativity, which intuitively requires that all components of a filter's internal state are representative of the true state of the underlying dynamical system. We show that informativity overcomes the aforementioned degeneracy. Specifically, we propose a regularizer which explicitly enforces informativity, and establish that gradient descent on this regularized objective - combined with a "reconditioning step" - converges to the globally optimal cost at a $O(1/T)$ rate.


Differentially Private Learning Needs Hidden State (Or Much Faster Convergence)

Neural Information Processing Systems

Prior work on differential privacy analysis of randomized SGD algorithms relies on composition theorems, where the implicit (unrealistic) assumption is that the internal state of the iterative algorithm is revealed to the adversary. As a result, the R\'enyi DP bounds derived by such composition-based analyses linearly grow with the number of training epochs. When the internal state of the algorithm is hidden, we prove a converging privacy bound for noisy stochastic gradient descent (on strongly convex smooth loss functions). We show how to take advantage of privacy amplification by sub-sampling and randomized post-processing, and prove the dynamics of privacy bound for sample without replacement'' stochastic mini-batch gradient descent schemes. We prove that, in these settings, our privacy bound converges exponentially fast and is substantially smaller than the composition bounds, notably after a few number of training epochs. Thus, unless the DP algorithm converges fast, our privacy analysis shows that hidden state analysis can significantly amplify differential privacy.


Defining the Scope of Learning Analytics: An Axiomatic Approach for Analytic Practice and Measurable Learning Phenomena

Takii, Kensuke, Liang, Changhao, Ogata, Hiroaki

arXiv.org Machine Learning

Learning Analytics (LA) has rapidly expanded through practical and technological innovation, yet its foundational identity has remained theoretically under-specified. This paper addresses this gap by proposing the first axiomatic theory that formally defines the essential structure, scope, and limitations of LA. Derived from the psychological definition of learning and the methodological requirements of LA, the framework consists of five axioms specifying discrete observation, experience construction, state transition, and inference. From these axioms, we derive a set of theorems and propositions that clarify the epistemological stance of LA, including the inherent unobservability of learner states, the irreducibility of temporal order, constraints on reachable states, and the impossibility of deterministically predicting future learning. We further define LA structure and LA practice as formal objects, demonstrating the sufficiency and necessity of the axioms and showing that diverse LA approaches -- such as Bayesian Knowledge Tracing and dashboards -- can be uniformly explained within this framework. The theory provides guiding principles for designing analytic methods and interpreting learning data while avoiding naive behaviorism and category errors by establishing an explicit theoretical inference layer between observations and states. This work positions LA as a rigorous science of state transition systems based on observability, establishing the theoretical foundation necessary for the field's maturation as a scholarly discipline.


CERNet: Class-Embedding Predictive-Coding RNN for Unified Robot Motion, Recognition, and Confidence Estimation

Sawada, Hiroki, Pitti, Alexandre, Quoy, Mathias

arXiv.org Artificial Intelligence

Robots interacting with humans must not only generate learned movements in real-time, but also infer the intent behind observed behaviors and estimate the confidence of their own inferences. This paper proposes a unified model that achieves all three capabilities within a single hierarchical predictive-coding recurrent neural network (PC-RNN) equipped with a class embedding vector, CERNet, which leverages a dynamically updated class embedding vector to unify motor generation and recognition. The model operates in two modes: generation and inference. In the generation mode, the class embedding constrains the hidden state dynamics to a class-specific subspace; in the inference mode, it is optimized online to minimize prediction error, enabling real-time recognition. Validated on a humanoid robot across 26 kinesthetically taught alphabets, our hierarchical model achieves 76% lower trajectory reproduction error than a parameter-matched single-layer baseline, maintains motion fidelity under external perturbations, and infers the demonstrated trajectory class online with 68% Top-1 and 81% Top-2 accuracy. Furthermore, internal prediction errors naturally reflect the model's confidence in its recognition. This integration of robust generation, real-time recognition, and intrinsic uncertainty estimation within a compact PC-RNN framework offers a compact and extensible approach to motor memory in physical robots, with potential applications in intent-sensitive human-robot collaboration.


Scaling Internal-State Policy-Gradient Methods for POMDPs

Aberdeen, Douglas, Baxter, Jonathan

arXiv.org Artificial Intelligence

Policy-gradient methods have received increased attention recently as a mechanism for learning to act in partially observable environments. They have shown promise for problems admitting memoryless policies but have been less successful when memory is required. In this paper we develop several improved algorithms for learning policies with memory in an infinite-horizon setting -- directly when a known model of the environment is available, and via simulation otherwise. We compare these algorithms on some large POMDPs, including noisy robot navigation and multi-agent problems.


Real-Time Procedural Learning From Experience for AI Agents

Bi, Dasheng, Hu, Yubin, Nasir, Mohammed N.

arXiv.org Artificial Intelligence

Learning how to do things from trial and error in real time is a hallmark of biological intelligence, yet most LLM-based agents lack mechanisms to acquire procedural knowledge after deployment. We propose Procedural Recall for Agents with eXperiences Indexed by State (PRAXIS), a lightweight post-training learning mechanism that stores the consequences of actions and retrieves them by jointly matching environmental and internal states of past episodes to the current state. PRAXIS augments agentic action selection with retrieved state-action-result exemplars that are generated in real time. When evaluated on the REAL web browsing benchmark, PRAXIS improves task completion accuracy, reliability, and cost efficiency across different foundation model backbones, and shows preliminary generalization to unseen tasks in similar environments. These results demonstrate that PRAXIS enables the practical adoption of AI agents in fast-evolving stateful environments by helping them learn new procedures effectively.


Exploring Syntropic Frameworks in AI Alignment: A Philosophical Investigation

Spizzirri, Austin

arXiv.org Artificial Intelligence

The alignment problem--ensuring advanced AI systems act in accordance with human values-- represents one of the most pressing philosophical and technical challenges of our time. As Bostrom (2014) and Russell (2019) have argued, the difficulty lies not merely in creating capable systems, but in ensuring these systems remain beneficial as their capabilities grow. Current approaches typically attempt to specify human values directly, whether through reward modeling, constitutional AI, or iterative refinement based on human feedback. Yet these content-based approaches face a fundamental philosophical problem: human values are contextual, often contradictory, and resist precise specification. The attempt to encode a complete value system encounters what I call the "specification trap"--the more precisely we attempt to define our values, the more we realize their dependence on implicit knowledge, cultural context, and evolutionary history that cannot be fully articulated. This paper's central thesis is that alignment should be reconceived not as a problem of value specification but as one of process architecture: creating syntropic, reasons-responsive agents whose values emerge through embodied multi-agent interaction rather than being encoded through training. What follows is a framework and research program proposal rather than a report of completed empirical results. I defend this thesis through four interconnected arguments that support three central contributions. Part I diagnoses the specification trap that makes content-based approaches structurally unstable.


Training Introspective Behavior: Fine-Tuning Induces Reliable Internal State Detection in a 7B Model

Rivera, Joshua Fonseca

arXiv.org Artificial Intelligence

Lindsey (2025) investigates introspective awareness in language models through four experiments, finding that models can sometimes detect and identify injected activation patterns -- but unreliably (~20% success in the best model). We focus on the first of these experiments -- self-report of injected "thoughts" -- and ask whether this capability can be directly trained rather than waiting for emergence. Through fine-tuning on transient single-token injections, we transform a 7B parameter model from near-complete failure (0.4% accuracy, 6.7% false positive rate) to reliable detection (85% accuracy on held-out concepts at α=40, 0% false positives). Our model detects fleeting "thoughts" injected at a single token position, retains that information, and reports the semantic content across subsequent generation steps. On this task, our trained model satisfies three of Lindsey's criteria: accuracy (correct identification), grounding (0/60 false positives), and internality (detection precedes verbalization). Generalization to unseen concept vectors (7.5pp gap) demonstrates the model learns a transferable skill rather than memorizing specific vectors, though this does not establish metacognitive representation in Lindsey's sense. These results address an open question raised by Lindsey: whether "training for introspection would help eliminate cross-model differences." We show that at least one component of introspective behavior can be directly induced, offering a pathway to built-in AI transparency.